Deep Learning for Mechanical Engineering

Homework 07 - Solution

Due Thursday, 11/18/2021, 2:00 PM


Instructor: Prof. Seungchul Lee
http://iai.postech.ac.kr/
Industrial AI Lab at POSTECH
  • For your handwritten solutions, scan or take a picture of them (you may also write them in markdown).

  • For your code, only the .ipynb file will be graded.

    • Please write your NAME and student ID in your .ipynb file name. ex) IljeokKim_20202467_HW07.ipynb
  • Please compress all the files into a single .zip file.

    • Please write your NAME and student ID in your .zip file name. ex) DogyeomPark_20202467_HW07.zip
    • Submit it to PLMS.
  • Do not submit a printed version of your code. It will not be graded.

Problem 1: Load the dataset¶

We will develop a convolutional neural network for classifying images of berries, birds, dogs, and flowers. Let's download the dataset.
Download the data from here. This dataset will be used for Problems 2 and 3.
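`flow_from_directory` (used below) infers each image's label from its parent folder, so the dataset must be laid out as one sub-folder per class inside each split. A minimal sketch of the layout it assumes, built in a temporary directory with empty placeholder files instead of real images:

```python
import tempfile
from pathlib import Path

# Build the expected layout:
# data_files/{train,validation,test}/{berry,bird,dog,flower}/...
root = Path(tempfile.mkdtemp()) / 'data_files'
classes = ['berry', 'bird', 'dog', 'flower']
for split in ['train', 'validation', 'test']:
    for c in classes:
        d = root / split / c
        d.mkdir(parents=True)
        (d / 'example_0.jpg').touch()  # placeholder, not a real image

# Each class folder becomes one label
for split in ['train', 'validation', 'test']:
    print(split, sorted(p.name for p in (root / split).iterdir()))
```
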

(1) Design your train, validation, and test dataset using each folder.¶

In [1]:
import numpy as np
import matplotlib.pyplot as plt
from sklearn.metrics import accuracy_score
import cv2
import random

import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Activation, Dense, Flatten, BatchNormalization, Conv2D, MaxPool2D, GlobalAveragePooling2D
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import categorical_crossentropy

import warnings
warnings.filterwarnings(action='ignore')
%matplotlib inline
In [2]:
train_path = './data_files/train'
validation_path = './data_files/validation'
test_path = './data_files/test'
In [3]:
classes=['berry', 'bird', 'dog', 'flower']
train_batches = ImageDataGenerator(rescale = 1./255,) \
    .flow_from_directory(directory=train_path, target_size=(224,224), classes=classes, batch_size=256)
validation_batches = ImageDataGenerator(rescale = 1./255,) \
    .flow_from_directory(directory=validation_path, target_size=(224,224), classes=classes, batch_size=256, shuffle=False)
test_batches = ImageDataGenerator(rescale = 1./255,) \
    .flow_from_directory(directory=test_path, target_size=(224,224), classes=classes, batch_size=256, shuffle=False)
Found 2400 images belonging to 4 classes.
Found 400 images belonging to 4 classes.
Found 800 images belonging to 4 classes.

(2) Plot 10 random data.¶

In [4]:
imgs, labels = next(train_batches)
def plotImages(images_arr):
    fig, axes = plt.subplots(1, 10, figsize=(20,20))
    axes = axes.flatten()
    for img, ax in zip(images_arr, axes):
        ax.imshow(img)
        ax.axis('off')
    plt.tight_layout()
    plt.show()
In [5]:
plotImages(imgs)  

Problem 2: Transfer Learning¶

We will use the VGG16 architecture to train on our dataset. As you can see in the image below, VGG16 has 16 weight layers and a large number of trainable parameters. Because machine learning libraries such as TensorFlow, Keras, and PyTorch provide models pre-trained on ImageNet, we don't have to design and train the model from scratch.

(3) Build the VGG16 model using a machine learning library such as TensorFlow, Keras, or PyTorch¶

In [6]:
vgg16_model = tf.keras.applications.vgg16.VGG16()
In [7]:
vgg16_model.summary()
Model: "vgg16"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 224, 224, 3)]     0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
flatten (Flatten)            (None, 25088)             0         
_________________________________________________________________
fc1 (Dense)                  (None, 4096)              102764544 
_________________________________________________________________
fc2 (Dense)                  (None, 4096)              16781312  
_________________________________________________________________
predictions (Dense)          (None, 1000)              4097000   
=================================================================
Total params: 138,357,544
Trainable params: 138,357,544
Non-trainable params: 0
_________________________________________________________________
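The fully connected layers dominate the parameter count in the summary above, which you can verify by hand (a Dense layer has `inputs × outputs + outputs` parameters, the extra term being the bias):

```python
# Dense layer parameters = inputs * outputs + outputs (bias)
fc1 = 25088 * 4096 + 4096          # flatten -> fc1
fc2 = 4096 * 4096 + 4096           # fc1 -> fc2
predictions = 4096 * 1000 + 1000   # fc2 -> 1000 ImageNet classes

print(fc1, fc2, predictions)       # 102764544 16781312 4097000
print(fc1 + fc2 + predictions)     # 123642856, ~89% of all 138,357,544 params
```

This is why the modification in (4) targets the fully connected part: replacing it removes most of the parameters.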

(4) Modify the original VGG16 architecture. As shown in the image below, we will modify only the fully connected part. Also, since we will use pre-trained parameters, the parameters of the feature-extraction part must be frozen.¶




In [8]:
model = Sequential()
for layer in vgg16_model.layers[:-4]:
    model.add(layer)
In [9]:
for layer in model.layers:
    layer.trainable = False
In [10]:
model.add(Conv2D(filters=1024, kernel_size=(3, 3), padding='same', activation='relu'))
model.add(GlobalAveragePooling2D())
model.add(Dense(units=4, activation='softmax'))
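`GlobalAveragePooling2D` averages each feature map over its spatial dimensions, so a `(7, 7, 1024)` tensor collapses to a 1024-vector. A minimal numpy sketch of the same operation on a random stand-in tensor:

```python
import numpy as np

feature_maps = np.random.rand(7, 7, 1024)  # stand-in for the new conv layer's output
pooled = feature_maps.mean(axis=(0, 1))    # one scalar per channel

print(pooled.shape)  # (1024,)
```

Replacing Flatten plus two large Dense layers with GAP drastically cuts parameters, and because the spatial feature maps feed the final Dense layer through a simple average, it is also what makes the class activation maps in Problem 3 possible.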
In [11]:
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 56, 56, 256)       295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 56, 56, 256)       590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 28, 28, 256)       0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 28, 28, 512)       1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 28, 28, 512)       2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 14, 14, 512)       0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 14, 14, 512)       2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 7, 7, 512)         0         
_________________________________________________________________
conv2d (Conv2D)              (None, 7, 7, 1024)        4719616   
_________________________________________________________________
global_average_pooling2d (Gl (None, 1024)              0         
_________________________________________________________________
dense (Dense)                (None, 4)                 4100      
=================================================================
Total params: 19,438,404
Trainable params: 4,723,716
Non-trainable params: 14,714,688
_________________________________________________________________
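The trainable parameters reported above are exactly the new head, which a quick arithmetic check confirms (a Conv2D layer has `kh × kw × in_channels × filters + filters` parameters):

```python
# Conv2D params = kh * kw * in_channels * filters + filters (bias)
new_conv = 3 * 3 * 512 * 1024 + 1024   # 4,719,616
new_dense = 1024 * 4 + 4               # 4,100

print(new_conv + new_dense)            # 4723716, matching "Trainable params"
```
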

(5) Train the modified VGG16 model. You will get a perfect score if the validation accuracy exceeds 90%.¶

In [12]:
model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x = train_batches, 
          steps_per_epoch = len(train_batches),
          epochs = 5,
          verbose = 1,
          validation_data = validation_batches
         )
Epoch 1/5
10/10 [==============================] - 170s 17s/step - loss: 1.7026 - accuracy: 0.5275 - val_loss: 0.6652 - val_accuracy: 0.6850
Epoch 2/5
10/10 [==============================] - 184s 18s/step - loss: 0.5284 - accuracy: 0.8042 - val_loss: 0.4036 - val_accuracy: 0.8600
Epoch 3/5
10/10 [==============================] - 180s 18s/step - loss: 0.3588 - accuracy: 0.8654 - val_loss: 0.3770 - val_accuracy: 0.8650
Epoch 4/5
10/10 [==============================] - 183s 18s/step - loss: 0.2962 - accuracy: 0.8942 - val_loss: 0.2767 - val_accuracy: 0.9200
Epoch 5/5
10/10 [==============================] - 188s 19s/step - loss: 0.2474 - accuracy: 0.9087 - val_loss: 0.3015 - val_accuracy: 0.8775
Out[12]:
<keras.callbacks.History at 0x1274702b610>

(6) Print your accuracy on the test dataset.¶

In [13]:
test_imgs, test_labels = next(test_batches)
In [14]:
plotImages(test_imgs)
In [15]:
predictions = model.predict(x = test_batches, steps = len(test_batches), verbose = 0)
In [16]:
print('test accuracy: ', accuracy_score(y_true=test_batches.classes, y_pred=np.argmax(predictions, axis=-1)))
test accuracy:  0.89125
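`accuracy_score` here compares integer class indices: `test_batches.classes` holds the folder-derived true labels, and `np.argmax` picks the highest-probability class per row of the softmax output. A toy sketch of the same computation with hypothetical probabilities:

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples over 4 classes
probs = np.array([[0.9, 0.05, 0.03, 0.02],
                  [0.1, 0.7,  0.1,  0.1 ],
                  [0.2, 0.2,  0.5,  0.1 ],
                  [0.6, 0.2,  0.1,  0.1 ]])
y_true = np.array([0, 1, 2, 3])       # last sample is misclassified
y_pred = np.argmax(probs, axis=-1)    # -> [0, 1, 2, 0]

print((y_pred == y_true).mean())      # 0.75
```

Note that `shuffle=False` in the test generator is essential: otherwise `test_batches.classes` would no longer line up with the order of the predictions.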

Problem 3: Class Activation Maps¶

(7) As shown in the given figure, plot the CAM results.¶

In [17]:
get_output = tf.keras.backend.function([model.layers[0].input],
                                       [model.layers[-3].output, model.layers[-1].output])

c, p = get_output([test_imgs])
class_weights = model.layers[-1].get_weights()[0]
In [18]:
output = []
for num, idx in enumerate(np.argmax(p, axis=1)):
    cam = tf.matmul(np.expand_dims(class_weights[:,idx], axis = 0),
                    np.transpose(np.reshape(c[num], (7*7,1024))))
    cam = tf.keras.backend.eval(cam)
    cam = np.reshape(cam, (7,7))
    cam = (cam-np.min(cam))/(np.max(cam)-np.min(cam))
    cam = np.expand_dims(np.uint8(255*cam), axis=2)
    cam = cv2.applyColorMap(cv2.resize(cam, (224, 224)), cv2.COLORMAP_JET)
    cam = cv2.cvtColor(cam, cv2.COLOR_BGR2RGB)
    output.append(cam)
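The loop above computes, for each image, the CAM of its predicted class c as a weighted sum of the 1024 final feature maps, CAM(x, y) = Σₖ wₖᶜ fₖ(x, y), using that class's column of the last Dense layer's weights. A self-contained numpy sketch of the same computation on random stand-in data (shapes match the model above):

```python
import numpy as np

conv_out = np.random.rand(7, 7, 1024)  # stand-in for the final conv feature maps
w = np.random.rand(1024, 4)            # stand-in Dense weights: (channels, classes)
idx = 2                                # hypothetical predicted class

cam = conv_out.reshape(7 * 7, 1024) @ w[:, idx]    # weighted sum over channels
cam = cam.reshape(7, 7)
cam = (cam - cam.min()) / (cam.max() - cam.min())  # normalize to [0, 1]

print(cam.shape)  # (7, 7), upsampled to 224x224 for overlay in the loop above
```
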
In [19]:
idx_list = random.sample(range(len(test_imgs)), 5)  # 5 distinct random test images
idx_list
for i in idx_list:
    plt.figure(figsize=(6,3))
    plt.subplot(121)
    plt.imshow(test_imgs[i])
    plt.title('True: {} / Pred: {}'.format(classes[np.where(test_labels[i] == 1)[0][0]], classes[np.argmax(p[i])]))
    plt.axis('off')
    plt.subplot(122)
    plt.imshow(test_imgs[i])
    plt.imshow(output[i], 'jet', alpha = 0.7)
    plt.title('Class Activation Map')
    plt.axis('off')
    plt.tight_layout()
    plt.show()